Goto

Collaborating Authors

 math library


LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Neural Information Processing Systems

Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lean and enables interaction with the proof environment programmatically.


NaN-Propagation: A Novel Method for Sparsity Detection in Black-Box Computational Functions

arXiv.org Artificial Intelligence

When numerically evaluating a function's gradient, sparsity detection can enable substantial computational speedups through Jacobian coloring and compression. However, sparsity detection techniques for black-box functions are limited, and existing finite-difference-based methods suffer from false negatives due to coincidental zero gradients. These false negatives can silently corrupt gradient calculations, leading to difficult-to-diagnose errors. We introduce NaN-propagation, which exploits the universal contamination property of IEEE 754 Not-a-Number values to trace input-output dependencies through floating-point numerical computations. By systematically contaminating inputs with NaN and observing which outputs become NaN, the method reconstructs conservative sparsity patterns that eliminate a major source of false negatives. We demonstrate this approach on an aerospace wing weight model, achieving a 1.52x speedup while uncovering dozens of dependencies missed by conventional methods -- a significant practical improvement since gradient computation is often the bottleneck in optimization workflows. The technique leverages IEEE 754 compliance to work across programming languages and math libraries without requiring modifications to existing black-box codes. Furthermore, advanced strategies such as NaN payload encoding via direct bit manipulation enable faster-than-linear time complexity, yielding speed improvements over existing black-box sparsity detection methods. Practical algorithms are also proposed to mitigate challenges from branching code execution common in engineering applications.


LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Neural Information Processing Systems

Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lean and enables interaction with the proof environment programmatically.


Optimizing Inference Performance of Transformers on CPUs

arXiv.org Artificial Intelligence

This paper comes to address this gap by presenting an empirical analysis of scalability and performance of inferencing Transfomerbased The Transformer architecture revolutionized the field of natural models on CPUs. We identify the key component of the language processing (NLP). Transformers-based models (e.g., BERT) Transformer architecture where the bulk of the computation happens, power many important Web services, such as search, translation, namely, the matrix multiplication (matmul) operations, and question-answering, etc. While enormous research attention is paid propose three optimizations to speed them up. to the training of those models, relatively little efforts are made The first optimization is based on the observation that the performance to improve their inference performance. This paper comes to address of the matmul operation is heavily impacted not only this gap by presenting an empirical analysis of scalability by the shape (dimensions) of the source matrices and the available and performance of inferencing a Transformer-based model on computing resources (the number of worker threads), but also by CPUs.


Introduction to Numpy - A Math Library for Python

#artificialintelligence

In this article I'm just going to introduce you to the basics of what is mostly required for machine learning and datascience. I'm not going to cover everything that's possible with numpy library. This is the part one of numpy tutorial series.


Stan Biweekly Roundup, 6 October 2017

#artificialintelligence

Jonah Gabry returned from teaching a one-week course for a special EU research institute in Spain. Mitzi Morris has been knocking out bug fixes for the parser and some pull requests to refactor the underlying type inference to clear the way for tuples, sparse matrices, and higher-order functions. Michael Betancourt with help from Sean Talts spent last week teaching an intro course to physicists about Stan. Charles Margossian attended and said it went really well. Ben Goodrich, in addition to handling a slew of RStan issues has been diving into the math library to define derivatives for Bessel functions. Aki Vehtari has put us in touch with the MxNet developers at Amazon UK and we had our first conference call with them to talk about adding sparse matrix functionality to Stan (Neil Lawrence is working there now).


Intel's Optimized Tools and Frameworks for Machine Learning and Deep Learning

#artificialintelligence

Machine learning (ML) is a subset of the more general field of artificial intelligence (AI). ML is based on a set of algorithms that learn from data. Deep learning (DL) is a specialized ML technique that is based on a set of algorithms that attempt to model high-level abstractions in data by using a graph with multiple processing layers (https://en.wikipedia.org/wiki/Deep_learning). ML, and in particular DL, are currently used in a growing number of applications and industries, including image and video recognition/classification, face detection, natural language processing, and financial forecasting and prediction. A convenient way to work with DL is to use the Intel's optimized ML and DL frameworks. Using Intel optimized tools and frameworks to train and deploy deep networks guarantees that these tools will use Intel architecture in the most efficient way.